Overview

Dataset statistics

Number of variables: 14
Number of observations: 291835
Missing cells: 44396
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 31.2 MiB
Average record size in memory: 112.0 B
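
The derived figures in this block can be cross-checked from the raw counts. The sketch below is only an arithmetic check on the numbers printed above. The remark about 8 bytes per value is an inference from 112 B / 14 columns, not something the report states.

```python
# Figures copied from the dataset statistics above.
n_variables = 14
n_observations = 291835
missing_cells = 44396

# Missing cells (%) is taken over all cells, not over rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory divided by the number of rows.
# 31.2 MiB is itself rounded, so the result only approximates 112.0 B.
total_bytes = 31.2 * 1024 ** 2
avg_record_bytes = total_bytes / n_observations

# Inference (assumption): 112 B per record / 14 columns = 8 B per value.
bytes_per_value = 112 / n_variables

print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
print(f"bytes per value: {bytes_per_value:.0f}")
```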

Variable types

Categorical: 4
DateTime: 1
Numeric: 9

Alerts

VERSIE has constant value "1.0" (Constant)
DATUM_BESTAND has constant value "2022-04-12" (Constant)
PEILDATUM has constant value "2022-04-01" (Constant)
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1770 distinct values (High cardinality)
BEHANDELEND_SPECIALISME_CD is highly correlated with AANTAL_PAT_PER_SPC (High correlation)
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD (High correlation)
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD (High correlation)
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG (High correlation)
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG (High correlation)
AANTAL_PAT_PER_SPC is highly correlated with BEHANDELEND_SPECIALISME_CD and 1 other field (High correlation)
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC (High correlation)
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC (High correlation)
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with ZORGPRODUCT_CD and 1 other field (High correlation)
ZORGPRODUCT_CD is highly correlated with AANTAL_SUBTRAJECT_PER_SPC (High correlation)
VERSIE is highly correlated with PEILDATUM and 1 other field (High correlation)
PEILDATUM is highly correlated with VERSIE and 1 other field (High correlation)
DATUM_BESTAND is highly correlated with VERSIE and 1 other field (High correlation)
GEMIDDELDE_VERKOOPPRIJS has 44396 (15.2%) missing values (Missing)
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.01716558) (Skewed)
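
To illustrate what the Skewed alert measures, here is a minimal sketch of the moment-based skewness coefficient γ1 = m3 / m2^1.5. The toy data and the threshold constant are assumptions made for illustration. The report's own estimator may apply a small-sample correction, so its values can differ slightly from this formula.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Assumed alert threshold; the report only shows that 21.02 triggered the alert.
SKEW_THRESHOLD = 20

# Hypothetical toy data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1] * 99 + [1000]  # many small counts and one very large one

print(skewness(symmetric))
print(skewness(long_tail))
print(abs(21.01716558) > SKEW_THRESHOLD)
```

A long right tail like the one in AANTAL_SUBTRAJECT_PER_ZPD (median 15, maximum 239919) is the typical shape that produces a large positive γ1.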

Reproduction

Analysis started: 2022-05-02 01:52:58.833041
Analysis finished: 2022-05-02 01:53:22.123432
Duration: 23.29 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json
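
A report of this kind can be regenerated along the following lines. This is a sketch under assumptions: the function name, report title and output path are hypothetical, and the input file for this dataset is not named in the report. The timestamp arithmetic simply re-derives the duration shown above.

```python
from datetime import datetime

def build_report(df, output_path="report.html"):
    """Profile a pandas DataFrame and write the HTML report to output_path."""
    # Imported lazily so this module loads even where pandas-profiling is absent.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title="Dataset profile")  # title is a placeholder
    profile.to_file(output_path)
    return output_path

# Re-derive the duration from the two timestamps in the Reproduction block.
started = datetime.fromisoformat("2022-05-02 01:52:58.833041")
finished = datetime.fromisoformat("2022-05-02 01:53:22.123432")
elapsed_seconds = (finished - started).total_seconds()
print(f"{elapsed_seconds:.2f} seconds")
```

Calling `build_report(df)` requires pandas and pandas-profiling to be installed. The version used for this report was v3.1.1.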

Variables

VERSIE
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Value  Count
1.0    291835

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 875505
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
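
As a sketch of how such character properties can be derived, the snippet below uses Python's standard `unicodedata` module on the constant VERSIE value. It reports two-letter general category codes: `Nd` corresponds to "Decimal Number" and `Po` to "Other Punctuation". Whether the report computes its counts in exactly this way is an assumption.

```python
import unicodedata
from collections import Counter

value = "1.0"     # the constant VERSIE value
n_rows = 291835   # number of observations

# Count characters and Unicode general categories within one value.
per_char = Counter(value)
per_category = Counter(unicodedata.category(ch) for ch in value)

# Every row holds the same value, so totals scale with the row count.
total_characters = len(value) * n_rows
category_counts = {cat: count * n_rows for cat, count in per_category.items()}

print(total_characters)
print(category_counts)
```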

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value  Count   Frequency (%)
1.0    291835  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

Value  Count   Frequency (%)
1      291835  33.3%
.      291835  33.3%
0      291835  33.3%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     583670  66.7%
Other Punctuation  291835  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      291835  50.0%
0      291835  50.0%

Other Punctuation
Value  Count   Frequency (%)
.      291835  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  875505  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      291835  33.3%
.      291835  33.3%
0      291835  33.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  875505  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      291835  33.3%
.      291835  33.3%
0      291835  33.3%

DATUM_BESTAND
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Value       Count
2022-04-12  291835

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2918350
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-04-12
2nd row: 2022-04-12
3rd row: 2022-04-12
4th row: 2022-04-12
5th row: 2022-04-12

Common Values

Value       Count   Frequency (%)
2022-04-12  291835  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

Value  Count    Frequency (%)
2      1167340  40.0%
0      583670   20.0%
-      583670   20.0%
4      291835   10.0%
1      291835   10.0%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    2334680  80.0%
Dash Punctuation  583670   20.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      1167340  50.0%
0      583670   25.0%
4      291835   12.5%
1      291835   12.5%

Dash Punctuation
Value  Count   Frequency (%)
-      583670  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  2918350  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
2      1167340  40.0%
0      583670   20.0%
-      583670   20.0%
4      291835   10.0%
1      291835   10.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2918350  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
2      1167340  40.0%
0      583670   20.0%
-      583670   20.0%
4      291835   10.0%
1      291835   10.0%

PEILDATUM
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Value       Count
2022-04-01  291835

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2918350
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-04-01
2nd row: 2022-04-01
3rd row: 2022-04-01
4th row: 2022-04-01
5th row: 2022-04-01

Common Values

Value       Count   Frequency (%)
2022-04-01  291835  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

Value  Count   Frequency (%)
2      875505  30.0%
0      875505  30.0%
-      583670  20.0%
4      291835  10.0%
1      291835  10.0%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    2334680  80.0%
Dash Punctuation  583670   20.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
2      875505  37.5%
0      875505  37.5%
4      291835  12.5%
1      291835  12.5%

Dash Punctuation
Value  Count   Frequency (%)
-      583670  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  2918350  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
2      875505  30.0%
0      875505  30.0%
-      583670  20.0%
4      291835  10.0%
1      291835  10.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2918350  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
2      875505  30.0%
0      875505  30.0%
-      583670  20.0%
4      291835  10.0%
1      291835  10.0%

JAAR
Date

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB
Minimum: 2012-01-01 00:00:00
Maximum: 2022-01-01 00:00:00
Histogram with fixed size bins (bins=11)

BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 423.5461991
Minimum: 301
Maximum: 8418
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 301
5-th percentile: 302
Q1: 305
Median: 313
Q3: 322
95-th percentile: 335
Maximum: 8418
Range: 8117
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 928.9834954
Coefficient of variation (CV): 2.193346316
Kurtosis: 69.92118548
Mean: 423.5461991
Median Absolute Deviation (MAD): 8
Skewness: 8.47403304
Sum: 123605605
Variance: 863010.3347
Monotonicity: Not monotonic
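
Several of these statistics follow from the others. The sketch below re-derives them from the printed values of BEHANDELEND_SPECIALISME_CD as a consistency check. It works only from the rounded figures in the report, so agreement is approximate.

```python
# Values copied from the statistics of BEHANDELEND_SPECIALISME_CD.
n_observations = 291835
minimum, maximum = 301, 8418
q1, q3 = 305, 322
mean = 423.5461991
std = 928.9834954

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n_observations     # Sum, since no values are missing

print(value_range, iqr)
print(round(cv, 6), round(variance, 2), round(total))
```

The large gap between the 95th percentile (335) and the maximum (8418) is what drives the high standard deviation, skewness and kurtosis for this code-valued column.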
Histogram with fixed size bins (bins=27)
Common values

Value              Count  Frequency (%)
305                41476  14.2%
313                37733  12.9%
303                33646  11.5%
330                23226  8.0%
316                19881  6.8%
308                15571  5.3%
306                12204  4.2%
324                12090  4.1%
301                11721  4.0%
304                9477   3.2%
Other values (17)  74810  25.6%

Minimum 10 values

Value  Count  Frequency (%)
301    11721  4.0%
302    6371   2.2%
303    33646  11.5%
304    9477   3.2%
305    41476  14.2%
306    12204  4.2%
307    5053   1.7%
308    15571  5.3%
310    3256   1.1%
313    37733  12.9%

Maximum 10 values

Value  Count  Frequency (%)
8418   3880   1.3%
1900   190    0.1%
390    765    0.3%
389    3118   1.1%
362    4140   1.4%
361    2084   0.7%
335    2961   1.0%
330    23226  8.0%
329    759    0.3%
328    6354   2.2%

TYPERENDE_DIAGNOSE_CD
Categorical

HIGH CARDINALITY

Distinct: 1770
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Value                Count
101                  1236
402                  1193
403                  1164
301                  1162
203                  1100
Other values (1765)  285980

Length

Max length: 4
Median length: 3
Mean length: 3.351479432
Min length: 2

Characters and Unicode

Total characters: 978079
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 0215
2nd row: 0114
3rd row: 0316
4th row: 0218
5th row: 0611

Common Values

Value                Count   Frequency (%)
101                  1236    0.4%
402                  1193    0.4%
403                  1164    0.4%
301                  1162    0.4%
203                  1100    0.4%
201                  1092    0.4%
401                  973     0.3%
404                  967     0.3%
802                  953     0.3%
409                  939     0.3%
Other values (1760)  281056  96.3%

Length

Histogram of lengths of the category

Most occurring characters

Value              Count   Frequency (%)
1                  187421  19.2%
0                  178766  18.3%
2                  129734  13.3%
3                  106029  10.8%
5                  75348   7.7%
9                  70560   7.2%
4                  69530   7.1%
7                  57616   5.9%
6                  51166   5.2%
8                  42043   4.3%
Other values (15)  9866    1.0%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    968213  99.0%
Uppercase Letter  9866    1.0%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
G                 1838   18.6%
M                 1650   16.7%
B                 1184   12.0%
E                 839    8.5%
Z                 807    8.2%
D                 670    6.8%
A                 640    6.5%
F                 620    6.3%
C                 331    3.4%
K                 318    3.2%
Other values (5)  969    9.8%

Decimal Number
Value  Count   Frequency (%)
1      187421  19.4%
0      178766  18.5%
2      129734  13.4%
3      106029  11.0%
5      75348   7.8%
9      70560   7.3%
4      69530   7.2%
7      57616   6.0%
6      51166   5.3%
8      42043   4.3%

Most occurring scripts

Value   Count   Frequency (%)
Common  968213  99.0%
Latin   9866    1.0%

Most frequent character per script

Latin
Value             Count  Frequency (%)
G                 1838   18.6%
M                 1650   16.7%
B                 1184   12.0%
E                 839    8.5%
Z                 807    8.2%
D                 670    6.8%
A                 640    6.5%
F                 620    6.3%
C                 331    3.4%
K                 318    3.2%
Other values (5)  969    9.8%

Common
Value  Count   Frequency (%)
1      187421  19.4%
0      178766  18.5%
2      129734  13.4%
3      106029  11.0%
5      75348   7.8%
9      70560   7.3%
4      69530   7.2%
7      57616   6.0%
6      51166   5.3%
8      42043   4.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  978079  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
1                  187421  19.2%
0                  178766  18.3%
2                  129734  13.3%
3                  106029  10.8%
5                  75348   7.7%
9                  70560   7.2%
4                  69530   7.1%
7                  57616   5.9%
6                  51166   5.2%
8                  42043   4.3%
Other values (15)  9866    1.0%

ZORGPRODUCT_CD
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5965
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 442329517.3
Minimum: 10501002
Maximum: 998418081
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 10501002
5-th percentile: 28999038
Q1: 99899003
Median: 149899003
Q3: 990004004
95-th percentile: 990516027
Maximum: 998418081
Range: 987917079
Interquartile range (IQR): 890105001

Descriptive statistics

Standard deviation: 429418787.8
Coefficient of variation (CV): 0.9708119647
Kurtosis: -1.744660847
Mean: 442329517.3
Median Absolute Deviation (MAD): 119999998
Skewness: 0.460012133
Sum: 1.290872347 × 10^14
Variance: 1.844004953 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
990004009            2114    0.7%
990004007            2083    0.7%
990003004            2065    0.7%
990004006            1698    0.6%
990356076            1522    0.5%
990356073            1409    0.5%
990003007            1340    0.5%
131999228            1274    0.4%
131999164            1261    0.4%
199299013            1209    0.4%
Other values (5955)  275860  94.5%

Minimum 10 values

Value     Count  Frequency (%)
10501002  8      < 0.1%
10501003  10     < 0.1%
10501004  10     < 0.1%
10501005  11     < 0.1%
10501007  3      < 0.1%
10501008  11     < 0.1%
10501010  10     < 0.1%
10501011  3      < 0.1%
11101002  9      < 0.1%
11101003  10     < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
998418081  144    < 0.1%
998418080  128    < 0.1%
998418079  35     < 0.1%
998418077  7      < 0.1%
998418076  7      < 0.1%
998418075  6      < 0.1%
998418074  188    0.1%
998418073  188    0.1%
998418072  7      < 0.1%
998418071  7      < 0.1%

AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9705
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 522.6910789
Minimum: 1
Maximum: 164654
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 14
Q3: 106
95-th percentile: 1783.3
Maximum: 164654
Range: 164653
Interquartile range (IQR): 103

Descriptive statistics

Standard deviation: 3201.824143
Coefficient of variation (CV): 6.125652938
Kurtosis: 388.5361534
Mean: 522.6910789
Median Absolute Deviation (MAD): 13
Skewness: 16.3777953
Sum: 152539551
Variance: 10251677.84
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
1                    47806   16.4%
2                    23445   8.0%
3                    15283   5.2%
4                    11275   3.9%
5                    8854    3.0%
6                    7387    2.5%
7                    6238    2.1%
8                    5255    1.8%
9                    4818    1.7%
10                   4241    1.5%
Other values (9695)  157233  53.9%

Minimum 10 values

Value  Count  Frequency (%)
1      47806  16.4%
2      23445  8.0%
3      15283  5.2%
4      11275  3.9%
5      8854   3.0%
6      7387   2.5%
7      6238   2.1%
8      5255   1.8%
9      4818   1.7%
10     4241   1.5%

Maximum 10 values

Value   Count  Frequency (%)
164654  1      < 0.1%
155884  1      < 0.1%
154270  1      < 0.1%
151286  1      < 0.1%
144725  1      < 0.1%
119281  1      < 0.1%
118040  1      < 0.1%
115941  1      < 0.1%
110520  1      < 0.1%
109675  1      < 0.1%

AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 10368
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 615.7620882
Minimum: 1
Maximum: 239919
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 15
Q3: 116
95-th percentile: 2026
Maximum: 239919
Range: 239918
Interquartile range (IQR): 113

Descriptive statistics

Standard deviation: 4101.048618
Coefficient of variation (CV): 6.660118732
Kurtosis: 704.0325055
Mean: 615.7620882
Median Absolute Deviation (MAD): 14
Skewness: 21.01716558
Sum: 179700929
Variance: 16818599.77
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
1                     46020   15.8%
2                     23037   7.9%
3                     15141   5.2%
4                     11042   3.8%
5                     8792    3.0%
6                     7381    2.5%
7                     6207    2.1%
8                     5188    1.8%
9                     4751    1.6%
10                    4240    1.5%
Other values (10358)  160036  54.8%

Minimum 10 values

Value  Count  Frequency (%)
1      46020  15.8%
2      23037  7.9%
3      15141  5.2%
4      11042  3.8%
5      8792   3.0%
6      7381   2.5%
7      6207   2.1%
8      5188   1.8%
9      4751   1.6%
10     4240   1.5%

Maximum 10 values

Value   Count  Frequency (%)
239919  1      < 0.1%
232431  1      < 0.1%
232118  1      < 0.1%
228047  1      < 0.1%
227606  1      < 0.1%
226776  1      < 0.1%
224099  1      < 0.1%
218623  1      < 0.1%
214231  1      < 0.1%
204769  1      < 0.1%

AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8615
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7853.93679
Minimum: 1
Maximum: 227540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 425
Median: 1785
Q3: 6606
95-th percentile: 37371
Maximum: 227540
Range: 227539
Interquartile range (IQR): 6181

Descriptive statistics

Standard deviation: 18022.98007
Coefficient of variation (CV): 2.294770197
Kurtosis: 32.85159273
Mean: 7853.93679
Median Absolute Deviation (MAD): 1618
Skewness: 4.979202913
Sum: 2292053643
Variance: 324827810.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
21                   455     0.2%
25                   437     0.1%
9                    430     0.1%
8                    417     0.1%
14                   408     0.1%
1                    395     0.1%
26                   392     0.1%
37                   388     0.1%
6                    385     0.1%
19                   379     0.1%
Other values (8605)  287749  98.6%

Minimum 10 values

Value  Count  Frequency (%)
1      395    0.1%
2      359    0.1%
3      313    0.1%
4      367    0.1%
5      311    0.1%
6      385    0.1%
7      340    0.1%
8      417    0.1%
9      430    0.1%
10     273    0.1%

Maximum 10 values

Value   Count  Frequency (%)
227540  23     < 0.1%
213802  24     < 0.1%
213754  17     < 0.1%
213538  25     < 0.1%
211599  17     < 0.1%
210437  19     < 0.1%
205351  17     < 0.1%
200605  16     < 0.1%
198530  20     < 0.1%
189109  19     < 0.1%

AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9499
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11250.18617
Minimum: 1
Maximum: 368507
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 56
Q1: 562
Median: 2458
Q3: 9299
95-th percentile: 53035
Maximum: 368507
Range: 368506
Interquartile range (IQR): 8737

Descriptive statistics

Standard deviation: 26658.00318
Coefficient of variation (CV): 2.369561069
Kurtosis: 36.47929954
Mean: 11250.18617
Median Absolute Deviation (MAD): 2248
Skewness: 5.227601553
Sum: 3283198082
Variance: 710649133.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
33                   339     0.1%
1                    320     0.1%
10                   319     0.1%
77                   317     0.1%
17                   316     0.1%
25                   311     0.1%
57                   307     0.1%
6                    306     0.1%
46                   303     0.1%
38                   303     0.1%
Other values (9489)  288694  98.9%

Minimum 10 values

Value  Count  Frequency (%)
1      320    0.1%
2      296    0.1%
3      264    0.1%
4      273    0.1%
5      266    0.1%
6      306    0.1%
7      272    0.1%
8      272    0.1%
9      246    0.1%
10     319    0.1%

Maximum 10 values

Value   Count  Frequency (%)
368507  23     < 0.1%
348526  25     < 0.1%
341695  19     < 0.1%
335999  24     < 0.1%
323792  20     < 0.1%
314674  17     < 0.1%
310780  17     < 0.1%
298653  17     < 0.1%
289047  16     < 0.1%
274559  17     < 0.1%

AANTAL_PAT_PER_SPC
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 287
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 685775.8265
Minimum: 2
Maximum: 1489487
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 2
5-th percentile: 43789
Q1: 350206
Median: 753965
Q3: 1017453
95-th percentile: 1336144
Maximum: 1489487
Range: 1489485
Interquartile range (IQR): 667247

Descriptive statistics

Standard deviation: 410261.8175
Coefficient of variation (CV): 0.5982447932
Kurtosis: -1.064700592
Mean: 685775.8265
Median Absolute Deviation (MAD): 309630
Skewness: -0.04365978264
Sum: 2.001333883 × 10^11
Variance: 1.683147589 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
880960              5102    1.7%
874266              4354    1.5%
843991              4348    1.5%
894394              4333    1.5%
880555              4273    1.5%
894933              4212    1.4%
753965              4083    1.4%
1084063             3890    1.3%
1101059             3864    1.3%
1063595             3851    1.3%
Other values (277)  249525  85.5%

Minimum 10 values

Value  Count  Frequency (%)
2      1      < 0.1%
4      7      < 0.1%
6      4      < 0.1%
7      5      < 0.1%
10     6      < 0.1%
12     14     < 0.1%
17     4      < 0.1%
21     15     < 0.1%
22     6      < 0.1%
24     19     < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
1489487  2976   1.0%
1450610  3054   1.0%
1421820  3564   1.2%
1345187  3543   1.2%
1336144  3439   1.2%
1332858  3546   1.2%
1317333  3463   1.2%
1296714  1181   0.4%
1283065  3577   1.2%
1262581  1201   0.4%

AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 288
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1101307.53
Minimum: 2
Maximum: 2666725
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 2
5-th percentile: 49801
Q1: 520135
Median: 1091367
Q3: 1813055
95-th percentile: 2557581
Maximum: 2666725
Range: 2666723
Interquartile range (IQR): 1292920

Descriptive statistics

Standard deviation: 727351.4007
Coefficient of variation (CV): 0.6604435009
Kurtosis: -0.8156803321
Mean: 1101307.53
Median Absolute Deviation (MAD): 627405
Skewness: 0.3061909623
Sum: 3.21400083 × 10^11
Variance: 5.290400601 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
1211813             5102    1.7%
1281742             4354    1.5%
1216294             4348    1.5%
1315716             4333    1.5%
1300618             4273    1.5%
1336472             4212    1.4%
1135254             4083    1.4%
2557581             3890    1.3%
2666725             3864    1.3%
2488283             3851    1.3%
Other values (278)  249525  85.5%

Minimum 10 values

Value  Count  Frequency (%)
2      1      < 0.1%
4      7      < 0.1%
6      4      < 0.1%
8      5      < 0.1%
10     6      < 0.1%
13     14     < 0.1%
17     4      < 0.1%
21     15     < 0.1%
22     6      < 0.1%
25     3      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2666725  3864   1.3%
2603360  3845   1.3%
2574080  3769   1.3%
2557581  3890   1.3%
2488283  3851   1.3%
2184158  3757   1.3%
2066229  3810   1.3%
2045000  1169   0.4%
1990305  1167   0.4%
1978427  3691   1.3%

GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)

MISSING

Distinct: 3382
Distinct (%): 1.4%
Missing: 44396
Missing (%): 15.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3560.434531
Minimum: 70
Maximum: 287220
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 70
5-th percentile: 140
Q1: 475
Median: 1255
Q3: 4145
95-th percentile: 13455
Maximum: 287220
Range: 287150
Interquartile range (IQR): 3670

Descriptive statistics

Standard deviation: 6554.232076
Coefficient of variation (CV): 1.840851733
Kurtosis: 155.9786254
Mean: 3560.434531
Median Absolute Deviation (MAD): 1020
Skewness: 7.460251504
Sum: 880990360
Variance: 42957958.1
Monotonicity: Not monotonic
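
Because GEMIDDELDE_VERKOOPPRIJS has missing values, its mean and sum are taken over the non-missing rows only. The sketch below checks this from the printed figures. It is an arithmetic illustration, not the report's own code.

```python
# Values copied from the statistics of GEMIDDELDE_VERKOOPPRIJS.
n_observations = 291835
n_missing = 44396
n_distinct = 3382
mean = 3560.434531
total = 880990360

n_present = n_observations - n_missing          # rows with a price
missing_pct = 100 * n_missing / n_observations  # Missing (%)
implied_count = total / mean                    # Sum / Mean recovers the row count used
distinct_pct = 100 * n_distinct / n_present     # Distinct (%) over non-missing rows

print(n_present, round(implied_count))
print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct")
```

Sum divided by Mean gives the number of non-missing rows, and Distinct (%) matches only when computed against that same count.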
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
160                  1842    0.6%
105                  1827    0.6%
110                  1771    0.6%
145                  1375    0.5%
180                  1373    0.5%
300                  1289    0.4%
165                  1258    0.4%
125                  1251    0.4%
185                  1235    0.4%
140                  1222    0.4%
Other values (3372)  232996  79.8%
(Missing)            44396   15.2%

Minimum 10 values

Value  Count  Frequency (%)
70     226    0.1%
75     75     < 0.1%
80     362    0.1%
85     917    0.3%
90     609    0.2%
95     673    0.2%
100    887    0.3%
105    1827   0.6%
110    1771   0.6%
115    872    0.3%

Maximum 10 values

Value   Count  Frequency (%)
287220  8      < 0.1%
148910  3      < 0.1%
142835  4      < 0.1%
122155  4      < 0.1%
116765  3      < 0.1%
109725  7      < 0.1%
108570  7      < 0.1%
107655  4      < 0.1%
101270  8      < 0.1%
95465   7      < 0.1%

Interactions

[Figures: pairwise interaction plots of the numeric variables (81 panels); images not reproduced]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
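
The calculation described above can be sketched in plain Python: rank both variables (ties get their average rank), then apply Pearson's formula to the ranks. The helper names and toy data are illustrative assumptions. The report itself most likely relies on a library routine.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)  # undefined when either variable is constant

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y is a monotonic but nonlinear function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

On this toy data ρ reaches 1 while r stays below 1, which is the sense in which ρ catches nonlinear monotonic relationships.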

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the slope does not affect r; only its sign does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
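
The calculation and the invariance property can be sketched in a few lines of plain Python. The function name and the sample values are hypothetical and serve only as an illustration under those assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors cancel between numerator and denominator,
    so plain sums are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Hypothetical sample.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and positively rescaling each variable separately leaves r unchanged.
x_rescaled = [2 * a + 10 for a in x]
y_rescaled = [0.5 * b - 3 for b in y]
r_rescaled = pearson_r(x_rescaled, y_rescaled)

# A negative scale factor on one variable flips the sign of r.
r_flipped = pearson_r(x, [-b for b in y])
```

Here `r` and `r_rescaled` agree up to floating-point rounding, and `r_flipped` has the same magnitude with the opposite sign.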

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
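
The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that definition. The function name and sample data are hypothetical, and library implementations commonly use the tie-adjusted τ-b instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 is a tie in x or y and counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical sample: 10 pairs, of which 8 are concordant and 2 discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
```

The double loop over all pairs is quadratic in the number of observations, which is fine for an illustration but slow for a dataset of this size.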

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
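
A minimal sketch of a bias-corrected Cramér's V along the lines of Bergsma's proposal is shown below. It assumes the usual form of the correction: φ² is reduced by (k-1)(r-1)/(n-1) and the category counts r and k are shrunk accordingly. The function name and the sample data are hypothetical, and this is not the report's own implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))       # observed cell counts
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, clipped at zero.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corrected - 1, k_corrected - 1)
    if denominator <= 0:
        return 0.0  # a constant variable carries no association
    return sqrt(phi2_corrected / denominator)


# Hypothetical data: perfectly associated labels, then independent labels.
v_perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 25,
                                    ["p", "q", "p", "q"] * 25)
```

With these inputs the perfectly associated pair yields a value of 1 (up to rounding) and the independent pair yields 0, because the correction clips the slightly negative estimate at zero.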

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
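
All three views are built on the same per-cell mask of missing versus present values. The sketch below shows that mask and the per-column counts for a tiny made-up table. The column names and rows are hypothetical, and `None` stands in for the `NaN` marker used in the real data.

```python
# Hypothetical rows; None marks a missing cell.
rows = [
    {"code": "A", "price": 10.0},
    {"code": "B", "price": None},
    {"code": None, "price": None},
]
columns = ["code", "price"]

# Nullity matrix: one row per record, True where the cell is missing.
nullity_matrix = [[row[col] is None for col in columns] for row in rows]

# Nullity by column: number of missing cells in each column.
missing_per_column = {
    col: sum(mask_row[i] for mask_row in nullity_matrix)
    for i, col in enumerate(columns)
}

# Share of missing cells over the whole table.
total_cells = len(rows) * len(columns)
missing_share = sum(missing_per_column.values()) / total_cells
```

Columns whose masks tend to be `True` in the same rows are the ones the dendrogram groups together.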

Sample

First rows

(index) | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS
0 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0215 | 990027054 | 1 | 1 | 1152 | 1536 | 182209 | 240074 | 58975.0
1 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0114 | 990027044 | 1 | 1 | 15984 | 18633 | 182209 | 240074 | NaN
2 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0316 | 990027021 | 4 | 4 | 1284 | 1734 | 182209 | 240074 | 6840.0
3 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0218 | 990027019 | 18 | 21 | 251 | 303 | 182209 | 240074 | 3575.0
4 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0611 | 990027065 | 1 | 1 | 1617 | 1725 | 182209 | 240074 | NaN
5 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0613 | 990027027 | 396 | 399 | 3316 | 3783 | 182209 | 240074 | 8615.0
6 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0616 | 990027041 | 3 | 3 | 3184 | 4059 | 182209 | 240074 | 20210.0
7 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0112 | 990027013 | 145 | 146 | 965 | 1137 | 182209 | 240074 | 465.0
8 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0211 | 990027013 | 2 | 2 | 78 | 118 | 182209 | 240074 | 465.0
9 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0615 | 990027014 | 138 | 149 | 859 | 978 | 182209 | 240074 | 1385.0

Last rows

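(index) | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS
291825 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0317 | 990027199 | 1 | 1 | 3 | 3 | 93 | 93 | NaN
291826 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0615 | 990027199 | 1 | 1 | 1 | 1 | 93 | 93 | NaN
291827 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0415 | 990027198 | 7 | 7 | 9 | 9 | 93 | 93 | NaN
291828 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0115 | 990027199 | 3 | 3 | 4 | 4 | 93 | 93 | NaN
291829 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0616 | 990027199 | 5 | 5 | 5 | 5 | 93 | 93 | NaN
291830 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0713 | 990027198 | 4 | 4 | 27 | 27 | 93 | 93 | NaN
291831 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0115 | 990027198 | 1 | 1 | 4 | 4 | 93 | 93 | NaN
291832 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0116 | 990027199 | 13 | 13 | 13 | 13 | 93 | 93 | NaN
291833 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0415 | 990027199 | 2 | 2 | 9 | 9 | 93 | 93 | NaN
291834 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0117 | 990027199 | 2 | 2 | 2 | 2 | 93 | 93 | NaN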